Using EASI Sentinel-1 RTC Gamma0 data¶
This notebook demonstrates how to load and use Sentinel-1 Radiometric Terrain Corrected (RTC) Gamma0 data generated in EASI.
The processing uses SNAP-10 with a Graph Processing Tool (GPT) XML recipe for RTC Gamma0 and its variants.
For most uses we recommend the smoothed 20 m product (sentinel1_grd_gamma0_20m).
We can process the 10 m products (sentinel1_grd_gamma0_10m, sentinel1_grd_gamma0_10m_unsmooth) on request. Please also ask if you wish to trial other combinations of the parameters.
RTC Gamma0 product variants¶
| | sentinel1_grd_gamma0_20m | sentinel1_grd_gamma0_10m | sentinel1_grd_gamma0_10m_unsmooth |
|---|---|---|---|
| DEM | | | |
| copernicus_dem_30 | Y | Y | Y |
| Scene to DEM extent multiplier | 3.0 | 3.0 | 3.0 |
| SNAP operator | | | |
| Apply-Orbit-File | Y | Y | Y |
| ThermalNoiseRemoval | Y | Y | Y |
| Remove-GRD-Border-Noise | Y | Y | Y |
| Calibration | Y | Y | Y |
| SetNoDataValue | Y | Y | Y |
| Terrain-Flattening | Y | Y | Y |
| Speckle-Filter | Y | Y | N |
| Multilook | Y | Y | N |
| Terrain-Correction | Y | Y | Y |
| Output | | | |
| Projection | WGS84, epsg:4326 | WGS84, epsg:4326 | WGS84, epsg:4326 |
| Pixel resolution | 20 m | 10 m | 10 m |
| Pixel alignment (PixelIsArea = top-left) | PixelIsArea | PixelIsArea | PixelIsArea |
Units and conversions¶
The sentinel1_grd_gamma0_* data are in intensity units. Intensity can be converted to dB and amplitude, and vice versa, with the following formulas.
Practical Xarray examples are given below.
Intensity to/from dB:
dB = 10*log10(intensity)
intensity = 10**(dB/10)
Intensity to/from Amplitude:
intensity = amplitude * amplitude
amplitude = sqrt(intensity)
In this notebook we define two such functions for xarray Datasets/DataArrays, using numpy.
def intensity_to_db(x):
    return 10 * numpy.log10(x)
def db_to_intensity(db):
    return numpy.power(10, db / 10.0)
Reference: https://forum.step.esa.int/t/what-stage-of-processing-requires-the-linear-to-from-db-command
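As a quick numeric sanity check of these conversions (standalone Python, standard library only; the values here are illustrative, not from the product):

```python
import math

# Intensity to dB and back: dB = 10*log10(intensity); intensity = 10**(dB/10)
intensity = 0.1
db = 10 * math.log10(intensity)   # -10.0 dB
back = 10 ** (db / 10)            # recovers 0.1

# Intensity to amplitude and back: amplitude = sqrt(intensity)
amplitude = math.sqrt(intensity)
assert math.isclose(amplitude * amplitude, intensity)
assert math.isclose(back, intensity)

print(db)  # -10.0
```

Note that intensity values <= 0 have no dB equivalent, which is why the helper function below masks them to NaN before taking the logarithm.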
# Basic plots
%matplotlib inline
# import matplotlib.pyplot as plt
# plt.rcParams['figure.figsize'] = [12, 8]
# Common imports and settings
import os, sys, re
from pathlib import Path
from IPython.display import Markdown
import pandas as pd
pd.set_option("display.max_rows", None)
import xarray as xr
import numpy as np
# Datacube
import datacube
from datacube.utils.aws import configure_s3_access
import odc.geo.xr # https://github.com/opendatacube/odc-geo
from datacube.utils import masking # https://github.com/opendatacube/datacube-core/blob/develop/datacube/utils/masking.py
from dea_tools.plotting import display_map, rgb # https://github.com/GeoscienceAustralia/dea-notebooks/tree/develop/Tools
# EASI tools
import git
repo = git.Repo('.', search_parent_directories=True).working_tree_dir # This gets the current repo directory. Alternatively replace with the easi-notebooks repo path in your home directory
if repo not in sys.path: sys.path.append(repo)
from easi_tools import EasiDefaults, xarray_object_size
from easi_tools.notebook_utils import mostcommon_crs, initialize_dask, localcluster_dashboard, heading
# Holoviews
import hvplot.xarray
import cartopy.crs as ccrs
EASI defaults¶
These default values are configured for each EASI instance. They help us to use the same training notebooks in each EASI instance. You may find some of the functions convenient for your work or you can easily override the values in your copy of this notebook.
easi = EasiDefaults()
family = 'sentinel-1'
product = easi.product(family)
# product = 'sentinel1_grd_gamma0_20m'
display(Markdown(f'Default {family} product for "{easi.name}": [{product}]({easi.explorer}/products/{product})'))
Successfully found configuration for deployment "asia"
Default sentinel-1 product for "asia": sentinel1_grd_gamma0_20m
Dask cluster¶
It's nearly always worth starting a Dask cluster, as it can improve data load and processing speed.
# Local cluster
cluster, client = initialize_dask(workers=4)
display(client)
# Or use Dask Gateway - this may take a few minutes
# cluster, client = initialize_dask(use_gateway=True, workers=4)
# display(client)
Successfully found configuration for deployment "asia"
[Dask client summary: LocalCluster (distributed) — 4 workers, 8 total threads, 24.00 GiB total memory, status: running]
Dashboard: https://hub.asia.easi-eo.solutions/user/pag064/proxy/8787/status
ODC database¶
Connect to the ODC database. The EASI Sentinel-1 data are produced and stored in a local bucket, so the configure_s3_access() function is not required for this product; it is only needed when reading from third-party "requester-pays" buckets.
dc = datacube.Datacube()
# Access AWS "requester-pays" buckets
# This is necessary for reading data from most third-party AWS S3 buckets such as for Landsat and Sentinel-2
configure_s3_access(aws_unsigned=False, requester_pays=True, client=client);
Example query¶
Change any of the parameters in the query object below to adjust the location, time, projection, or spatial resolution of the returned datasets.
Use the Explorer interface to check the temporal and spatial coverage for each product.
# Explorer link
display(Markdown(f'See: {easi.explorer}/products/{product}'))
# EASI defaults
display(Markdown(f'#### Location: {easi.location}'))
latitude_range = easi.latitude
longitude_range = easi.longitude
time_range = easi.time
# Or set your own latitude / longitude
# Australia GWW
# latitude_range = (-33, -32.6)
# longitude_range = (120.5, 121)
# time_range = ('2020-01-01', '2020-01-31')
# Example: PNG
latitude_range = (-4.26, -3.75)
longitude_range = (144.03, 144.74)
time_range = ('2020-01-01', '2020-05-31')
# Bangladesh
# latitude_range = (21.5, 23.5)
# longitude_range = (89, 90.5)
# time_range = ('2024-05-01', '2024-06-10')
# Vietnam
# (optional target projection = epsg:32648)
# latitude_range = (9.1, 9.9)
# longitude_range = (105.6, 106.4)
# time_range = ('2024-01-01', '2024-09-10')
query = {
'product': product, # Product name
'x': longitude_range, # "x" axis bounds
'y': latitude_range, # "y" axis bounds
'time': time_range, # Any parsable date strings
}
# Convenience function to display the selected area of interest
display_map(longitude_range, latitude_range)
Location: Lake Tempe, Indonesia¶
Load data¶
# Target xarray parameters
# - Select a set of measurements to load
# - output CRS and resolution
# - Usually we group input scenes on the same day to a single time layer (groupby)
# - Select a reasonable Dask chunk size (this should be adjusted depending on the
#   spatial and resolution parameters you choose)
load_params = {
'dask_chunks': {'latitude':2048, 'longitude':2048}, # Dask chunk size
'group_by': 'solar_day', # Group by day method
}
# Load data
data = dc.load(**(query | load_params))
display(xarray_object_size(data))
display(data)
'Dataset size: 5.16 GB'
<xarray.Dataset> Size: 6GB
Dimensions: (time: 51, latitude: 2550, longitude: 3550)
Coordinates:
* time (time) datetime64[ns] 408B 2020-01-03T08:39:20 ... 2020-05-3...
* latitude (latitude) float64 20kB -3.75 -3.75 -3.751 ... -4.26 -4.26
* longitude (longitude) float64 28kB 144.0 144.0 144.0 ... 144.7 144.7
spatial_ref int32 4B 4326
Data variables:
vh (time, latitude, longitude) float32 2GB dask.array<chunksize=(1, 2048, 2048), meta=np.ndarray>
vv (time, latitude, longitude) float32 2GB dask.array<chunksize=(1, 2048, 2048), meta=np.ndarray>
angle (time, latitude, longitude) float32 2GB dask.array<chunksize=(1, 2048, 2048), meta=np.ndarray>
Attributes:
crs: EPSG:4326
    grid_mapping: spatial_ref
# When happy with the shape and size of chunks, persist() the result
data = data.persist()
Conversion and helper functions¶
# These functions use numpy, which should be satisfactory for most notebooks.
# Calculations for larger or more complex arrays may require Xarray's "ufunc" capability.
# https://docs.xarray.dev/en/stable/examples/apply_ufunc_vectorize_1d.html
#
# Apply numpy.log10 to the DataArray
# log10_data = xr.apply_ufunc(np.log10, data)
def intensity_to_db(da: 'xr.DataArray'):
"""Return an array converted to dB values"""
xx = da.where(da > 0, np.nan) # Set values <= 0 to NaN
xx = 10*np.log10(xx)
xx.attrs.update({"units": "dB"})
return xx
def db_to_intensity(da: 'xr.DataArray'):
"""Return an array converted to intensity values"""
xx = np.power(10, da/10.0)
xx.attrs.update({"units": "intensity"})
return xx
def make_image(ds: 'xarray', frame_height=300, **kwargs):
"""Return a Holoviews object that can be displayed or combined"""
spatial_dims = ds.odc.spatial_dims
defaults = dict(
cmap="Greys_r",
y = spatial_dims[0], x = spatial_dims[1],
rasterize = True,
geo = True,
frame_height = frame_height,
clabel = ds.attrs.get('units', None),
)
defaults.update(**kwargs)
return ds.hvplot.image(**defaults)
def select_valid_time_layers(ds: 'xarray', percent: float = 5):
"""Select time layers that have at least a given percentage of valid data (e.g., >=5%)
Example usage:
selected = select_valid_time_layers(ds, percent=5)
filtered = ds.sel(time=selected)
"""
spatial_dims = ds.odc.spatial_dims
return ds.count(dim=spatial_dims).values / (ds.sizes[spatial_dims[0]]*ds.sizes[spatial_dims[1]]) >= (percent/100.0)
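The selection logic of select_valid_time_layers can be illustrated with a small standalone numpy sketch (synthetic data, no xarray or ODC dependency; NaN marks invalid pixels, as in the masked dB arrays):

```python
import numpy as np

# Synthetic (time, y, x) stack: three 4x4 layers
stack = np.full((3, 4, 4), 1.0)
stack[0, :, :] = np.nan    # layer 0: 0% valid
stack[1, :2, :] = np.nan   # layer 1: 50% valid
                           # layer 2: 100% valid

# Count valid (non-NaN) pixels per layer and compare against a 5% threshold,
# mirroring ds.count(dim=spatial_dims) / (ny * nx) >= percent/100
valid_fraction = np.sum(~np.isnan(stack), axis=(1, 2)) / (4 * 4)
selected = valid_fraction >= 0.05
print(selected)  # [False  True  True]
```

The boolean array returned has one entry per time layer, so it can be passed straight to `ds.sel(time=selected)` as in the docstring example.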
# Examples to check that the intensity to/from dB functions work as expected
# xx = data.vv.isel(time=0,latitude=np.arange(0, 5),longitude=np.arange(0, 5))
# xx[0] = 0
# xx[1] = -0.001
# display(xx.values)
# yy = intensity_to_db(xx)
# display(yy.values)
# zz = db_to_intensity(yy)
# display(zz.values)
# Optional time layer filter
selected = select_valid_time_layers(data.vv, 10)
data = data.sel(time=selected).persist()
/env/lib/python3.12/site-packages/rasterio/warp.py:387: NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned. dest = _reproject(
# Add db values to the dataset
data['vh_db'] = intensity_to_db(data.vh).persist()
data['vv_db'] = intensity_to_db(data.vv).persist()
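One convenient property of the dB bands added above: a ratio of intensities becomes a simple subtraction in dB. The VH/VV cross-ratio is a common derived quantity in SAR analysis (not part of this product; shown here as a hedged sketch with illustrative values):

```python
import numpy as np

def cross_ratio_db(vv_db, vh_db):
    """VH/VV ratio expressed in dB: a division in intensity
    units is a subtraction in dB units."""
    return vh_db - vv_db

# Check the equivalence with illustrative intensity values
vv, vh = 0.2, 0.02
ratio_db = 10 * np.log10(vh / vv)  # -10.0 dB
assert np.isclose(cross_ratio_db(10 * np.log10(vv), 10 * np.log10(vh)), ratio_db)

# On the dataset loaded above this would be, e.g.:
# data['cr_db'] = data.vh_db - data.vv_db
```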
Plot the data¶
Note the different data ranges for plotting (clim) between vv, vh, intensity and dB.
The two polarisations tend to discriminate different features or characteristics in the landscape, such as flooded areas or vegetation structure.
Intensity
# A single time layer for VV and VH, with linked axes
idx = 0
time_label = data.time.dt.strftime("%Y-%m-%d %H:%M:%S").values[idx]
vvplot = make_image(data.vv.isel(time=idx), clim=(0, 0.5), title=f'VV ({time_label})')
vhplot = make_image(data.vh.isel(time=idx), clim=(0, 0.1), title=f'VH ({time_label})')
vvplot + vhplot
# Make a dB plot
vvplot = make_image(data.vv_db.isel(time=idx), clim=(-30, -3), title=f'VV ({time_label})')
vhplot = make_image(data.vh_db.isel(time=idx), clim=(-30, -1), title=f'VH ({time_label})')
vvplot + vhplot
# Subplots for each time layer for VH (dB), with linked axes
make_image(data.vh_db, clim=(-30, -3), robust=True).layout().cols(4)